Efficient Neural Ranking using Forward Indexes and Lightweight Encoders
Dual-encoder-based dense retrieval models have become the standard in IR.
They employ large Transformer-based language models, which are notoriously
inefficient in terms of resources and latency. We propose Fast-Forward indexes
-- vector forward indexes which exploit the semantic matching capabilities of
dual-encoder models for efficient and effective re-ranking. Our framework
enables re-ranking at very high retrieval depths and combines the merits of
both lexical and semantic matching via score interpolation. Furthermore, in
order to mitigate the limitations of dual-encoders, we tackle two main
challenges: Firstly, we improve computational efficiency by either
pre-computing representations, avoiding unnecessary computations altogether, or
reducing the complexity of encoders. This allows us to considerably improve
ranking efficiency and latency. Secondly, we optimize the memory footprint and
maintenance cost of indexes; we propose two complementary techniques to reduce
the index size and show that, by dynamically dropping irrelevant document
tokens, the index maintenance efficiency can be improved substantially. We
perform an extensive evaluation to show the effectiveness and efficiency of
Fast-Forward indexes -- our method has low latency and achieves competitive
results without the need for hardware acceleration, such as GPUs.
Comment: Accepted at ACM TOIS. arXiv admin note: text overlap with arXiv:2110.0605
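The interpolation of lexical and semantic scores described above can be sketched as follows. This is a minimal illustration, assuming a forward index that maps document ids to precomputed dense vectors; the names (`forward_index`, `alpha`, `rerank`) are illustrative, not the paper's API.

```python
# Hedged sketch of interpolation-based re-ranking over a vector forward index:
# dense document representations are precomputed, so re-ranking needs only
# dot products and no GPU-bound encoder calls at query time.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rerank(lexical_results, query_vec, forward_index, alpha=0.5):
    """Interpolate lexical retrieval scores with semantic scores.

    lexical_results: list of (doc_id, lexical_score), e.g. from BM25
    forward_index:   dict doc_id -> precomputed dense document vector
    alpha:           interpolation weight between lexical and semantic scores
    """
    scored = []
    for doc_id, lex_score in lexical_results:
        sem_score = dot(query_vec, forward_index[doc_id])
        scored.append((doc_id, alpha * lex_score + (1 - alpha) * sem_score))
    return sorted(scored, key=lambda x: x[1], reverse=True)

# Toy usage: two documents, semantic score flips the lexical ordering's margin.
index = {"d1": [1.0, 0.0], "d2": [0.0, 1.0]}
results = rerank([("d1", 2.0), ("d2", 3.0)], [0.0, 1.0], index, alpha=0.5)
```

Because the document vectors are looked up rather than computed, this step scales to very high retrieval depths at low latency, which is the efficiency argument the abstract makes.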
A Review of the Role of Causality in Developing Trustworthy AI Systems
State-of-the-art AI models largely lack an understanding of the cause-effect
relationship that governs human understanding of the real world. Consequently,
these models do not generalize to unseen data, often produce unfair results,
and are difficult to interpret. This has led to efforts to improve the
trustworthiness aspects of AI models. Recently, causal modeling and inference
methods have emerged as powerful tools. This review aims to provide the reader
with an overview of causal methods that have been developed to improve the
trustworthiness of AI models. We hope that our contribution will motivate
future research on causality-based solutions for trustworthy AI.
Comment: 55 pages, 8 figures. Under review
Learning Faithful Attention for Interpretable Classification of Crisis-Related Microblogs under Constrained Human Budget
The recent widespread use of social media platforms has created convenient ways to obtain and spread up-to-date information during crisis events such as disasters. Time-critical analysis of crisis data can help humanitarian organizations gain actionable information and plan aid responses. Many existing studies have proposed methods to identify informative messages and categorize them into different humanitarian classes. Advanced neural network architectures tend to achieve state-of-the-art performance, but their decisions are opaque. While attention heatmaps offer insights into a model's prediction, some studies have found that standard attention does not provide meaningful explanations. Alternatively, recent works have proposed interpretable approaches for the classification of crisis events that rely on human rationales to train models and extract short snippets as explanations. However, rationale annotations are not always available, especially in real-time situations for new tasks and events. In this paper, we propose a two-stage approach to learn rationales under minimal human supervision and derive faithful machine attention. Extensive experiments over four crisis events show that our model obtains better or comparable classification performance (∼86% Macro-F1) to baselines and faithful attention heatmaps using only 40-50% human-level supervision. Further, we employ a zero-shot learning setup to detect actionable tweets along with actionable word snippets as rationales.
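One common way to make attention align with human rationales is to penalize divergence between the attention distribution and a token-level rationale mask. The sketch below illustrates that general idea only; the function name and the cross-entropy formulation are assumptions for illustration, not the paper's exact two-stage objective.

```python
import math

def attention_rationale_loss(attention, rationale_mask):
    """Cross-entropy between a normalized human-rationale token mask and the
    model's attention distribution over the same tokens. Adding this term to
    the classification loss encourages attention to land on rationale tokens.

    attention:      list of attention weights, summing to 1
    rationale_mask: list of 0/1 flags marking human-annotated rationale tokens
    """
    total = sum(rationale_mask)
    target = [m / total for m in rationale_mask]  # uniform over rationale tokens
    eps = 1e-12  # numerical guard against log(0)
    return -sum(t * math.log(a + eps) for t, a in zip(target, attention))

# Uniform attention over two tokens, only the first is a rationale token:
loss = attention_rationale_loss([0.5, 0.5], [1, 0])
```

The loss is minimized when attention mass concentrates on the rationale tokens, which is the sense in which the resulting heatmaps can be called faithful to the human annotation.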
Supervised Contrastive Learning Approach for Contextual Ranking
Contextual ranking models have delivered impressive performance improvements
over classical models in the document ranking task. However, these highly
over-parameterized models tend to be data-hungry and require large amounts of
data even for fine tuning. This paper proposes a simple yet effective method to
improve ranking performance on smaller datasets using supervised contrastive
learning for the document ranking problem. We perform data augmentation by
creating training data using parts of the relevant documents in the
query-document pairs. We then use a supervised contrastive learning objective
to learn an effective ranking model from the augmented dataset. Our experiments
on subsets of the TREC-DL dataset show that, although data augmentation
increases the training data size, it does not necessarily improve
performance under existing pointwise or pairwise training objectives. However,
our proposed supervised contrastive loss objective leads to performance
improvements over the standard non-augmented setting showcasing the utility of
data augmentation using contrastive losses. Finally, we demonstrate the real
benefit of supervised contrastive learning objectives through marked
improvements on smaller ranking datasets covering news (Robust04), finance
(FiQA), and scientific fact checking (SciFact).
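The supervised contrastive objective referred to above pulls embeddings with the same label together while pushing apart those with different labels. Below is a minimal pure-Python sketch of a standard supervised contrastive loss (in the style of Khosla et al.); it is an illustration of the loss family, not the paper's exact training code.

```python
import math

def sup_con_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss over L2-normalized embeddings.

    For each anchor i, every other sample with the same label is a positive;
    the anchor's loss is the mean negative log-probability of its positives
    under a temperature-scaled softmax over all other samples.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def normalize(v):
        n = math.sqrt(dot(v, v))
        return [x / n for x in v]

    z = [normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:  # anchors without positives contribute nothing
            continue
        exp_sim = {j: math.exp(dot(z[i], z[j]) / temperature)
                   for j in range(n) if j != i}
        denom = sum(exp_sim.values())
        total += -sum(math.log(exp_sim[p] / denom) for p in positives) / len(positives)
        anchors += 1
    return total / anchors

# Two same-label vectors that already coincide, one different-label vector:
loss = sup_con_loss([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]], [0, 0, 1])
```

In the ranking setting described above, the positives would come from augmented views of the same relevant document, so the loss rewards representations that cluster a query's relevant passages together.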
Extractive Explanations for Interpretable Text Ranking
Neural document ranking models perform impressively well due to superior language understanding gained from pre-training tasks. However, due to their complexity and large number of parameters, these (typically transformer-based) models are often non-interpretable in that ranking decisions cannot be clearly attributed to specific parts of the input documents. In this article, we propose ranking models that are inherently interpretable by generating explanations as a by-product of the prediction decision. We introduce the Select-And-Rank paradigm for document ranking, where we first output an explanation as a selected subset of sentences in a document. Thereafter, we solely use the explanation or selection to make the prediction, making explanations first-class citizens in the ranking process. Technically, we treat sentence selection as a latent variable trained jointly with the ranker from the final output. To that end, we propose an end-to-end training technique for Select-And-Rank models utilizing reparameterizable subset sampling based on the Gumbel-max trick. We conduct extensive experiments to demonstrate that our approach is competitive with state-of-the-art methods. Our approach is broadly applicable to numerous ranking tasks and furthers the goal of building models that are interpretable by design. Finally, we present real-world applications that benefit from our sentence selection method.
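The Gumbel-max trick behind the subset sampling above can be sketched simply: perturbing each sentence's log-score with independent Gumbel noise and keeping the top-k indices draws a size-k subset without replacement. This shows only the discrete forward sampling step, assuming per-sentence log-scores from a selector; the paper's reparameterizable (relaxed, differentiable) variant for end-to-end training is not reproduced here.

```python
import math
import random

def gumbel_topk(log_scores, k, rng=random):
    """Sample k indices without replacement via the Gumbel-max trick:
    add independent Gumbel(0, 1) noise to each log-score, take the top-k."""
    # -log(-log(U)) with U ~ Uniform(0, 1) is a standard Gumbel sample
    perturbed = [s - math.log(-math.log(rng.random())) for s in log_scores]
    order = sorted(range(len(log_scores)), key=lambda i: perturbed[i], reverse=True)
    return sorted(order[:k])

# Toy usage: select 2 of 5 sentences, higher-scored sentences more likely.
random.seed(0)
selected = gumbel_topk([0.1, 2.0, 0.3, 1.5, 0.2], k=2)
```

Because only the selected sentences feed the ranker, the sampled subset itself is the explanation, which is what makes the explanation a first-class citizen of the prediction.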